Knowledge Intensive Word Alignment with KNOWA

Authors

  • Emanuele Pianta
  • Luisa Bentivogli
Abstract

In this paper we present KNOWA, an English/Italian word aligner developed at ITC-irst, which relies mostly on information contained in bilingual dictionaries. The performance of KNOWA is compared with that of GIZA++, a state-of-the-art statistics-based alignment algorithm. The two algorithms are evaluated on the EuroCor and MultiSemCor tasks, that is, on two publicly available English/Italian parallel corpora. The results of the evaluation show that, given the nature and the size of the available English/Italian parallel corpora, a language-resource-based word aligner such as KNOWA can outperform a fully statistics-based algorithm such as GIZA++.

Similar resources

Knowledge-lite extraction of multi-word units with language filters and entropy thresholds

In this paper two approaches to knowledge-lite terminology extraction are compared, both involving language filters which are used to remove ill-formed multi-word units (MWUs). A knowledge-lite approach entails swift portability to new languages and to new domains, which is difficult to achieve if knowledge-intensive resources such as grammars, parsers, taggers and lexicons are used. The two ap...

Guiding Statistical Word Alignment Models With Prior Knowledge

We present a general framework to incorporate prior knowledge, such as heuristics or linguistic features, in statistical generative word alignment models. Prior knowledge plays the role of probabilistic soft constraints between bilingual word pairs, which are used to guide word alignment model training. We investigate knowledge that can be derived automatically from entropy principle and bilingu...

Multi-Word Expression-Sensitive Word Alignment

This paper presents a new word alignment method which incorporates knowledge about Bilingual Multi-Word Expressions (BMWEs). Our method of word alignment first extracts such BMWEs in a bidirectional way for a given corpus and then starts conventional word alignment, considering the properties of BMWEs in their grouping as well as their alignment links. We give partial annotation of alignment li...

Word Alignment with Synonym Regularization

We present a novel framework for word alignment that incorporates synonym knowledge collected from monolingual linguistic resources in a bilingual probabilistic model. Synonym information is helpful for word alignment because we can expect a synonym to correspond to the same word in a different language. We design a generative model for word alignment that uses synonym information as a regulari...

Knowledge Management of Knowledge Intensive Business Processes with PKA Method

In this article we tested a process-knowledge allocation (PKA) method on a real knowledge-intensive process. The PKA method is based on an optimal balance between employee knowledge structure and process structural indexes (degree of process lean-ity). We found that it is useful, in knowledge-intensive processes such as the new product development (NPD) process, to reorganize it with activity cutting principle ...

Publication date: 2004